An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data.

Authors

  • Goo Jun
  • Mary Kate Wing
  • Gonçalo R Abecasis
  • Hyun Min Kang
Abstract

The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires fewer computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.

Similar resources

A simple and efficient DNA extraction protocol for old herbarium leaves of Bellevalia (Asparagaceae, Scilloideae)

High-quality DNA extraction is important for producing sharp bands in gel electrophoresis and clean chromatograms. DNA is usually extracted using the modified CTAB method, but this method cannot obtain high-quality DNA for molecular analysis from old dried leaves of Bellevalia because they contain various chemical compounds that inhibit obtaining a clear DNA extractio...

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem with traditional information integration is the lack of personalization due to weak information resources or the unavailability of analysis functionality. In this ...

An Efficient Method For DNA Extraction From Paraffin Wax Embedded Tissues For PCR Amplification Of Human And Viral DNA

Background and Objective: Formalin-fixed paraffin-embedded tissues are a valuable source of DNA for molecular studies. We designed and optimized an efficient procedure for DNA extraction from formalin-fixed paraffin-embedded tissues. Materials and Methods: Seventy-three blocks of cervical paraffin-embedded tissues were investigated. DNA was extracted using 45 minutes of boiling in alkaline sol...

An efficient and simple CTAB based method for total genomic DNA isolation from low amounts of aquatic plants leaves with a high level of secondary metabolites

This paper reports an efficient DNA isolation protocol specifically modified to obtain DNA of the purity required for molecular studies. Some aquatic plants (Potamogeton spp., Ceratophyllum demersum and Myriophyllum spicatum) were used for the study. The protocol developed will be useful for obtaining a high yield of pure DNA. Instead of using the available DNA extraction kits, this protocol can ...

Journal title:
  • Genome Research

Volume 25, Issue 6

Pages: -

Publication date: 2015